Jiansheng Chen

PACT: Proactive Asking for Continual Task Assistance in Human-Robot Collaboration

May 23, 2026
Via arXiv

MHSA: A Lightweight Framework for Mitigating Hallucinations via Steered Attention in LVLMs

May 14, 2026
Via arXiv

Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images

Apr 21, 2026
Via arXiv

Knowledge-Guided Adversarial Training for Infrared Object Detection via Thermal Radiation Modeling

Mar 26, 2026
Via arXiv

Video-Only ToM: Enhancing Theory of Mind in Multimodal Large Language Models

Mar 25, 2026
Via arXiv

Step-DeepResearch Technical Report

Dec 24, 2025
Via arXiv

XYZCylinder: Feedforward Reconstruction for Driving Scenes Based on A Unified Cylinder Lifting Method

Oct 09, 2025
Via arXiv

Step-Audio 2 Technical Report

Jul 24, 2025
Via arXiv

From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models

Jun 17, 2025
Via arXiv

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025
Via arXiv